Goto

Collaborating Authors

 alternating update




Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms

Neural Information Processing Systems

Matrix Factorization is a popular non-convex optimization problem, for which alternating minimization schemes are mostly used. They usually suffer from the major drawback that the solution is biased towards one of the optimization variables. A remedy is non-alternating schemes. However, due to a lack of Lipschitz continuity of the gradient in matrix factorization problems, convergence cannot be guaranteed. A recently developed approach relies on the concept of Bregman distances, which generalizes the standard Euclidean distance. We exploit this theory by proposing a novel Bregman distance for matrix factorization problems, which, at the same time, allows for simple/closed form update steps. Therefore, for non-alternating schemes, such as the recently introduced Bregman Proximal Gradient (BPG) method and an inertial variant Convex--Concave Inertial BPG (CoCaIn BPG), convergence of the whole sequence to a stationary point is proved for Matrix Factorization. In several experiments, we observe a superior performance of our non-alternating schemes in terms of speed and objective value at the limit point.


Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates

Neural Information Processing Systems

Non-negative matrix factorization is a popular tool for decomposing data into feature and weight matrices under non-negativity constraints. It enjoys practical success but is poorly understood theoretically. This paper proposes an algorithm that alternates between decoding the weights and updating the features, and shows that assuming a generative model of the data, it provably recovers the ground-truth under fairly mild conditions. In particular, its only essential requirement on features is linear independence. Furthermore, the algorithm uses ReLU to exploit the non-negativity for decoding the weights, and thus can tolerate adversarial noise that can potentially be as large as the signal, and can tolerate unbiased noise much larger than the signal. The analysis relies on a carefully designed coupling between two potential functions, which we believe is of independent interest.




Reviews: Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms

Neural Information Processing Systems

The Bregman Proximal Gradient (BPG) method was introduced in general form with theory by Bolte et al. in [8] for nonconvex optimization, and an inertial version with convex-concave backtracking (CoCaInBPG) was introduced by Mukkamala et al. in [45]. In section 6.6 of the same paper, Mukkamala et al. apply CoCaInBPG to the problem of structured matrix factorization and show good performance, but leave the theory and computational efficiency open. Additionally, BPG without inertia was applied sucessfully to the problem of symmetric non-negative matrix factorization (SNMF) by Dragomir et al. in [20], which appears in its most recent version to also include more general (symmetric) matrix completion. The submitted paper here is concerned with applying BPG and CoCaInBPG to the non-symmetric matrix factorization problem, essentially picking up where [45] left off and providing work complementary to [20]. This paper first restates the results of [8,45] for the specific objective (1.1) of matrix factorization, then makes its two primary contributions. First, the paper introduces a kernel generating distance function h that is appropriate for matrix factorization, which is related to the universal function of [20] but extended non-trivially to the non-symmetric case.


Reviews: Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms

Neural Information Processing Systems

The paper introduces a new function generating a Bregman distance and develop Bregman proximal gradient methods applied to various matrix factorization problems. The authors clearly outline the benefits of the approach and computationally demonstrate this.


Reviews: Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates

Neural Information Processing Systems

The recovery guarantee for NMF with mild conditions is very important for the machine learning community. This paper provides a solution under three main conditions as follows. However, it is unclear if this initialization can be achieved. The proof technique seems to be novel, i.e., different from the popular techniques used for alternating minimization of matrix completion or the tensor methods. Therefore, it will probably be useful for the community.


Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms

Neural Information Processing Systems

Matrix Factorization is a popular non-convex optimization problem, for which alternating minimization schemes are mostly used. They usually suffer from the major drawback that the solution is biased towards one of the optimization variables. A remedy is non-alternating schemes. However, due to a lack of Lipschitz continuity of the gradient in matrix factorization problems, convergence cannot be guaranteed. A recently developed approach relies on the concept of Bregman distances, which generalizes the standard Euclidean distance. We exploit this theory by proposing a novel Bregman distance for matrix factorization problems, which, at the same time, allows for simple/closed form update steps.